Recent application of the Bayesian algorithm BORG to the Sloan Digital Sky Survey (SDSS) main sample 
galaxies resulted in the physical inference of the formation history of the observed large-scale structure from 
its origin to the present epoch. In this work, we use these inferences as inputs for a detailed probabilistic 
cosmic web-type analysis. To do so, we generate a large set of data-constrained realizations of the large-scale 
structure using a fast, fully non-linear gravitational model. We then perform a dynamic classification of the 
cosmic web into four distinct components (voids, sheets, filaments, and clusters) on the basis of the tidal field. 
Our inference framework automatically and self-consistently propagates typical observational uncertainties 
to web-type classification. As a result, this study produces accurate cosmographic classification of large-scale 
structure elements in the SDSS volume. By also providing the history of these structure maps, the approach 
allows an analysis of the origin and growth of the early traces of the cosmic web present in the initial 
density field and of the evolution of global quantities such as the volume and mass filling fractions of different 
structures. For the problem of web-type classification, the results described in this work constitute the first 
connection between theory and observations at non-linear scales including a physical model of structure 
formation and the demonstrated capability of uncertainty quantification. A connection between cosmology 
and information theory using real data also naturally emerges from our probabilistic approach. Our results 
constitute quantitative chrono-cosmography of the complex web-like patterns underlying the observed galaxy 
distribution. 


I. INTRODUCTION 

The large-scale distribution of matter in the Universe is known to form intricate, complex
patterns traced by galaxies. The existence of this large-scale structure (LSS), also known as
the cosmic web (Bond, Kofman & Pogosyan, 1996), has been suggested by early observational
projects aiming at mapping the Universe (Gregory & Thompson, 1978; Kirshner et al., 1981;
de Lapparent, Geller & Huchra, 1986; Geller & Huchra, 1989; Shectman et al., 1996), and has
been extensively analyzed since then by massive surveys such as the 2dFGRS (Colless et al.,
2003), the SDSS (e.g. Gott et al., 2005) or the 2MASS redshift survey (Huchra et al., 2012).
The cosmic web is usually segmented into different elements: voids, sheets, filaments, and
clusters. At late times, low-density regions (voids) occupy most of the volume of the
Universe. They are surrounded by walls (or sheets) from which departs a network of denser
filaments. At the intersection of filaments lie the densest clumps of matter (clusters).
Dynamically, matter tends to flow out of the voids to their compensation walls, transits
through filaments and finally accretes in the densest halos.

Describing the cosmic web morphology is an involved task due to the intrinsic complexity of
individual structures, but also to their connectivity and the hierarchical nature of their
global organization. First approaches (e.g. Barrow, Bhavsar & Sonoda, 1985; Gott, Dickinson &
Melott, 1986; Babul & Starkman, 1992; Mecke, Buchert & Wagner, 1994; Sahni, Sathyaprakash &
Shandarin, 1998) often characterized the LSS with a set of global and statistical diagnostics,
without providing a way to locally identify cosmic web elements. In the last decade, a variety
of methods has been developed for segmenting the LSS into its components and applied to
numerical simulations and observations. Among them, some focus on investigating one component
at a time, in particular filaments (e.g. the Candy model - Stoica et al., 2005; Stoica,
Martinez & Saar, 2007, 2010, the skeleton analysis - Novikov, Colombi & Dore, 2006; Sousbie
et al., 2008, and DisPerSE - Sousbie, 2011; Sousbie, Pichon & Kawahara, 2011) or voids (e.g.
Plionis & Basilakos, 2002; Colberg et al., 2005; Shandarin et al., 2006; Platen, van de
Weygaert & Jones, 2007; Neyrinck, 2008; Sutter et al., 2015; Elyiv et al., 2015, see also
Colberg et al., 2008 for a void finder comparison project). Unfortunately, this approach does
not allow an analysis of the connections between cosmic web components, identified in the same
framework. Another important class of web classifiers dissects clusters, filaments, walls, and
voids at the same time. In particular, several recent studies deserve special attention due to
their methodological richness. The “T-web” and “V-web” (Hahn et al., 2007a; Forero-Romero
et al., 2009; Hoffman et al., 2012) characterize the cosmic web based on the tidal and
velocity shear fields. DIVA (Lavaux & Wandelt, 2010) rather uses the shear of the Lagrangian
displacement field. ORIGAMI (Falck, Neyrinck & Szalay, 2012) identifies single and
multi-stream regions in the full six-dimensional phase-space information (Abel, Hahn &
Kaehler, 2012; Neyrinck, 2012; Shandarin, Habib & Heitmann, 2012). The Multiscale Morphology
Filter (Aragon-Calvo et al., 2007) and later refinements NEXUS/NEXUS+ (Cautun, van de
Weygaert & Jones, 2013) follow a multiscale approach which probes the hierarchical nature of
the cosmic web.

In the standard theoretical picture, the cosmic web arises from the anisotropic nature of
gravitational collapse, which drives the formation of structure in the Universe from
primordial fluctuations (Peebles, 1980). The capital importance of the large-scale tidal field
in the formation and evolution of the cosmic web was first pointed out in the seminal work of
Zel’dovich (1970). In the Zel’dovich approximation, the late-time morphology of structures is
linked to the eigenvalues of the tidal tensor in the initial conditions. Gravitational
collapse amplifies any anisotropy present in the primordial density field to give rise to
highly asymmetrical structures. This picture explains the segmented nature of the LSS, but not
its connectivity. The cosmic web theory of Bond, Kofman & Pogosyan (1996) asserted the deep
connection between the tidal field around rare density peaks in the initial fluctuations and
the final web pattern, in particular the filamentary cluster-cluster bridges. More generally,
the shaping of the cosmic web through gravitational clustering is essentially a deterministic
process described by Einstein’s equations and the main source of stochasticity in the problem
enters in the generation of initial conditions, which are known, from inflationary theory, to
resemble a Gaussian random field to very high accuracy (Guth & Pi, 1982; Hawking, 1982;
Bardeen, Steinhardt & Turner, 1983). For these reasons, considerable effort has been devoted
to a theoretical understanding of the LSS in terms of perturbation theory in the Eulerian and
Lagrangian frames (for a review, see Bernardeau et al., 2002). While this approach offers
important analytical insights, it can only describe structure formation in the linear and
mildly non-linear regimes and it is usually limited to the first few correlation functions of
the density field. The complete description of the connection between primordial fluctuations
and the late-time LSS, including a full phase-space treatment and the entire hierarchy of
correlators, has to rely on a numerical treatment through N-body simulations. The
characterization of cosmic web environments in the non-linear regime and the description of
their time evolution have only been treated recently, following the application of web
classifiers to state-of-the-art simulations. In particular, Hahn et al. (2007a); Aragon-Calvo,
van de Weygaert & Jones (2010) presented a local description of structure types in
high-resolution cosmological simulations. Hahn et al. (2007b); Bond, Strauss & Cen (2010);
Cautun et al. (2014) analyzed the time evolution of the cosmic web in terms of the mass and
volume content of web-type components, their density distribution, and a set of new analysis
tools especially designed for particular elements.

To the best of our knowledge, neither the classification of cosmic environments at non-linear
scales in physical realizations of the LSS nor the investigation of their genesis and growth,
using real data and with demonstrated capability of uncertainty quantification, has been
treated in the existing literature. In this work, we propose the first probabilistic web-type
analysis conducted with observational data in the deeply non-linear regime of LSS formation.
We build accurate maps of dynamic cosmic web components with a resolution of around 3 Mpc/h,
constrained by observations. In addition, our approach leads to the first quantitative
inference of the formation history of these environments and allows the construction of maps
of the embryonic traces in the initial perturbations of the late-time morphological features
of the cosmic web.

Cosmographic descriptions of the LSS in terms of three-dimensional maps, and in particular a
dynamic structure type cartography, carry potential for a rich variety of applications. Such
maps characterize the anisotropic nature of gravitational structure formation and the
clustering behavior of galaxies as a function of their tidal environment, and make it possible
to describe the traces of the cosmic web already imprinted in the initial conditions. So far,
most investigations focused on understanding the physical properties of dark halos and
galaxies in relation to the LSS. Hahn et al. (2007a,b, 2009); Hahn, Angulo & Abel (2014);
Aragon-Calvo, van de Weygaert & Jones (2010) found a systematic dependence of halo properties
such as morphological type, color, luminosity and spin parameter on their cosmic environment
(local density, velocity and tidal field). In addition, a correlation between halo shapes and
spins and the orientations of nearby filaments and sheets, predicted in simulations (Altay,
Colberg & Croft, 2006; Hahn et al., 2007a,b, 2009; Paz, Stasyszyn & Padilla, 2008; Zhang
et al., 2009; Codis et al., 2012; Libeskind et al., 2013; Welker et al., 2014; Aragon-Calvo &
Yang, 2014; Laigle et al., 2015), has been confirmed by observational galaxy data (Paz,
Stasyszyn & Padilla, 2008; Jones, van de Weygaert & Aragon-Calvo, 2010; Tempel, Stoica &
Saar, 2013; Zhang et al., 2013). Cartographic descriptions of the cosmic web also make it
possible to study the environmental dependence of galaxy properties (see e.g. Lee & Lee, 2008;
Lee & Li, 2008; Park, Kim & Park, 2010; Yan, Fan & White, 2012; Kovac et al., 2014) and to
make the connection between the sophisticated predictions for galaxy properties in
hydrodynamic simulations (e.g. Vogelsberger et al., 2014; Dubois et al., 2014; Codis et al.,
2015) and observations. Another wide range of applications of structure type reconstructions
is to probe the effect of the inhomogeneous large-scale structure on photon properties and
geodesics. For example, it is possible to interpret the weak gravitational lensing effects of
voids (Melchior et al., 2014; Clampitt & Jain, 2014). Dynamic information can also be used to
produce prediction templates for secondary effects expected in the cosmic microwave background
such as the kinetic Sunyaev-Zel’dovich effect (Li et al., 2014), the integrated Sachs-Wolfe
and Rees-Sciama effects (e.g. Cai et al., 2010; Ilic, Langer & Douspis, 2013; Planck
Collaboration, 2014a).

Building such refined cosmographic descriptions of the Universe requires high-dimensional,
non-linear inferences. In Jasche, Leclercq & Wandelt (2015) (BORG SDSS in the following), we
presented a chrono-cosmography project, aiming at reconstructing simultaneously the density
distribution, the velocity field and the formation history of the LSS from galaxies. To do so,
we used an advanced Bayesian inference algorithm to assimilate the Sloan Digital Sky Survey
(SDSS) DR7 data into the forecasts of a physical model of structure formation (second order
Lagrangian perturbation theory - 2LPT). Besides inferring the four-dimensional history of the
matter distribution, these results allow us to analyze the genesis and growth of the complex
web-like patterns that have been observed in our Universe. Therefore, this work constitutes a
new chrono-cosmography project, aiming at the analysis of the evolving cosmic web.

Our investigations rely on the inference of the initial conditions in the SDSS volume (BORG
SDSS). Starting from these, we generate a large set of constrained realizations of the
Universe using the COLA method (Tassev, Zaldarriaga & Eisenstein, 2013). This physical model
allows us to perform the first description of the cosmic web in the non-linear regime, using
real data, and to follow the time evolution of its constituting elements. Throughout this
paper, we adopt the Hahn et al. (2007a) dynamic web classifier, which segments the LSS into
voids, sheets, filaments, and clusters. This choice is motivated by the close relation between
the equations that dictate the dynamics of the growth of structures in the Zel’dovich
formalism and the Lagrangian description of the LSS which naturally emerges with BORG. As this
procedure relies on the estimation of the eigenvalues of the tidal tensor in Fourier space, it
constitutes a non-linear and non-local estimator of structure types, requiring adequate means
to propagate observational uncertainties to finally inferred products (web-type maps and all
derived quantities), in order not to misinterpret results. The BORG algorithm naturally
addresses this problem by providing a set of density realizations constrained by the data. The
variation between these samples constitutes a thorough quantification of uncertainty coming
from all observational effects (in particular the incompleteness of the data because of the
survey mask and the radial selection functions, as well as luminosity-dependent galaxy biases,
see BORG SDSS for details), not only with a point estimation but with a detailed treatment of
the likelihood. Hence, for all problems addressed in this work, we get a fully probabilistic
answer in terms of a prior and a posterior distribution. Building upon the robustness of our
uncertainty quantification procedure, we are able to make the first observationally-supported
link between cosmology and information theory (see Neyrinck, 2014, for theoretical
considerations related to this question) by looking at the entropy and Kullback-Leibler
divergence of probability distribution functions.
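
As a minimal sketch of the two information-theoretic quantities named above, the following Python functions evaluate the Shannon entropy of a discrete pdf and the Kullback-Leibler divergence of a posterior from a prior. The function names, the base-2 logarithm (results in bits) and the requirement of a strictly positive prior are assumptions made for illustration; this section does not specify them.

```python
import numpy as np

def entropy(pdf, axis=0):
    """Shannon entropy (in bits, an assumed unit) of a discrete pdf along `axis`."""
    p = np.asarray(pdf, dtype=float)
    # Terms with p = 0 contribute nothing (limit of p*log p as p -> 0).
    safe_p = np.where(p > 0, p, 1.0)
    return np.sum(np.where(p > 0, -p * np.log2(safe_p), 0.0), axis=axis)

def kl_divergence(posterior, prior, axis=0):
    """Kullback-Leibler divergence D(posterior || prior) in bits.

    Assumes the prior is strictly positive wherever it is evaluated.
    """
    p = np.asarray(posterior, dtype=float)
    q = np.asarray(prior, dtype=float)
    safe_p = np.where(p > 0, p, 1.0)
    return np.sum(np.where(p > 0, p * np.log2(safe_p / q), 0.0), axis=axis)
```

For a four-state pdf such as a web-type classification, the entropy ranges from 0 (one structure type is certain) to 2 bits (all four types equally probable).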

This paper is organized as follows. In section II, we 
describe our methodology: Bayesian large-scale structure 
inference with the BORG algorithm, non-linear filtering 
of samples with COLA and web-type classification using 
the Hahn et al. procedure. In sections III and IV, we 
describe the cosmic web at present and primordial times, 
respectively. In section V, we follow the time evolution 
of web-types as structures form in the Universe. Finally, 
we summarize our results and offer concluding comments 
in section VI. 


II. METHODS 

In this section, we describe our methodology step by 
step: 

1. inference of the initial conditions with BORG (section II A),

2. generation of data-constrained realizations of the SDSS volume via non-linear filtering of
BORG samples with COLA (section II B),

3. classification of the cosmic web in voids, sheets, filaments, and clusters, using the Hahn
et al. algorithm (section II C).


A. Bayesian large-scale structure inference with BORG 

This work builds upon results previously obtained by application of the BORG (Bayesian Origin
Reconstruction from Galaxies, Jasche & Wandelt, 2013) algorithm to Sloan Digital Sky Survey
data release 7 data (BORG SDSS). BORG is a full-scale Bayesian inference code which permits
simultaneous analysis of the morphology and formation history of the cosmic web.

As discussed in Jasche & Wandelt (2013), accurate and detailed cosmographic inferences from
observations require modeling the mildly non-linear and non-linear regime of the presently
observed matter distribution. The exact statistical behavior of the LSS in terms of a full
probability distribution function (pdf) for non-linearly evolved density fields is not known.
For this reason, the first full-scale reconstructions relied on phenomenological
approximations, such as multivariate Gaussian or log-normal distributions, incorporating a
cosmological power-spectrum to accurately represent correct two-point statistics of density
fields (see e.g. Lahav et al., 1994; Zaroubi, 2002; Erdogdu et al., 2004; Kitaura & Enßlin,
2008; Kitaura et al., 2009; Kitaura, Jasche & Metcalf, 2010; Jasche & Kitaura, 2010; Jasche
et al., 2010b,a). However, these prescriptions only model the one and two-point statistics of
the matter distribution. Additional statistical complexity of the evolved density field arises
from the fact that gravitational structure formation introduces mode coupling and phase
correlations. This manifests itself not only in a sheer amplitude difference of density and
velocity fields at different redshifts, but also in a modification of their statistical
behavior by the generation of higher-order correlation functions. An accurate modeling of
these high-order correlators is of crucial importance for a precise description of the
connectivity and hierarchical nature of the cosmic web, which is the aim of this paper.

While the statistical nature of the late-time density field is poorly understood, the initial
conditions from which it formed are known to obey Gaussian statistics to very great accuracy
(Planck Collaboration, 2014b). Therefore, it is reasonable to account for the increasing
statistical complexity of the evolving matter distribution by a dynamical model of structure
formation linking initial and final conditions. This naturally turns the problem of LSS
analysis into the task of inferring the initial conditions from present cosmological
observations (Jasche & Wandelt, 2013; Kitaura, 2013; Wang et al., 2013). This approach yields
a very high-dimensional and non-linear inference problem. Typically, the parameter space to
explore comprises on the order of $10^6$ to $10^7$ elements, corresponding to the voxels of
the map to be inferred. For reasons linked to computational cost, the BORG algorithm employs
second order Lagrangian perturbation theory (2LPT) as an approximation for the actual
gravitational dynamics linking initial three-dimensional Gaussian density fields to present,
non-Gaussian density fields. As known from perturbation theory (see e.g. Bernardeau et al.,
2002), in the linear and mildly non-linear regime, 2LPT correctly describes the one, two and
three-point statistics of the matter distribution and also approximates higher-order
correlators very well. It accounts in particular for tidal effects in its regime of validity.
Consequently, the BORG algorithm correctly transports the observational information
corresponding to complex web-like features from the final density field to the corresponding
initial conditions. Note that such an explicit Bayesian forward-modeling approach is always
more powerful than constraining (part of) the sequence of correlation functions, as it
accounts for the entire dark matter dynamics (in particular for the infinite hierarchy of
correlators), in its regime of validity. This is of particular importance, since the hierarchy
of correlation functions has been shown to be an insufficient description of density fields in
the non-linear regime (Carron, 2012; Carron & Neyrinck, 2012).

As discussed in BORG SDSS, our analysis comprehensively accounts for observational effects
such as selection functions, survey geometry, luminosity-dependent galaxy biases and noise.
Corresponding uncertainty quantification is provided by sampling from the high-dimensional
posterior distribution via an efficient implementation of the Hamiltonian Markov Chain Monte
Carlo method (see Jasche & Wandelt, 2013, for details). In particular, luminosity-dependent
galaxy biases are explicitly part of the BORG likelihood and the bias amplitudes are inferred
self-consistently during the run. Though not explicitly modeled, redshift-space distortions
are automatically mitigated: due to the prior preference for homogeneity and isotropy, such
anisotropic features are treated as noise in the data.

In the following, we make use of the 12,000 samples of the posterior distribution for
primordial density fields, obtained in BORG SDSS. These reconstructions, constrained by SDSS
observations, act as initial conditions for the generation of constrained large-scale
structure realizations. It is important to note that we directly make use of BORG outputs
without any further post-processing, which demonstrates the remarkable quality of our
inference results.


B. Non-linear filtering of samples with COLA 

Leclercq et al. (2013, section 2.A) performed a study of differences in the representation of
structure types in density fields predicted by LPT and N-body simulations. To do so, they used
the same web-type classification procedure as in this work (see section II C). In spite of the
visual similarity of LPT and N-body density fields at large and intermediate scales (above a
few Mpc/h), they found crucial differences in the representation of structures. Specifically,
LPT predicts fuzzier halos than full gravity, and incorrectly assigns the surroundings of
voids as part of them. This manifests itself in an overprediction of the volume occupied by
clusters and voids to the detriment of sheets and filaments. The substructure of voids is also
known to be incorrectly represented in 2LPT (Neyrinck, 2013; Leclercq et al., 2013).

For these reasons, in this work we cannot directly make use of the final BORG density samples,
which are a prediction of the 2LPT model. Instead, we rely on the inferred initial conditions,
which contain the data constraints (as described in BORG SDSS) and on a non-linear filtering
step similar to the one described in Leclercq et al. (2015, section 2.A). Due to the large
number of samples to be processed for this work, we do not use a fully non-linear simulation
code as in Leclercq et al. (2015), but the COLA method (COmoving Lagrangian Acceleration,
Tassev, Zaldarriaga & Eisenstein, 2013).

FIG. 1. Power spectra of dark matter density fields at redshift zero, computed with a mesh
size of 3 Mpc/h. The particle distributions are determined using: unconstrained 2LPT
realizations (“2LPT, prior”), constrained 2LPT samples inferred by BORG (“2LPT, posterior”),
unconstrained COLA realizations (“COLA, prior”), constrained samples inferred by BORG and
filtered with COLA (“COLA, posterior”). The solid lines correspond to the mean among all
realizations used in this work, and the shaded regions correspond to the 2-σ credible interval
estimated from the standard error of the mean. The dashed black curve represents
$P_{\mathrm{NL}}(k)$, the theoretical power spectrum expected at z = 0 from high-resolution
N-body simulations.

The initial density field, defined on a cubic equidistant grid with side length of 750 Mpc/h
and $256^3$ voxels, is populated by $512^3$ particles placed on a regular Lagrangian lattice.
The particles are evolved with 2LPT to the redshift of z = 69 and with COLA from z = 69 to
z = 0. The final density field is constructed by binning the particles with a cloud-in-cell
(CiC) method on a $256^3$-voxel grid. This choice corresponds to a resolution of around
3 Mpc/h for all the maps described in this paper. In this fashion, we generate a large set of
data-constrained reconstructions of the present-day dark matter distribution (see also Lavaux,
2010; Kitaura, 2013; Heß, Kitaura & Gottlöber, 2013; Nuza et al., 2014). To ensure sufficient
accuracy, 30 timesteps logarithmically-spaced in the scale factor are used for the evolution
with COLA. We checked that this setup yields vanishing difference in the representation of
final density fields with respect to the prediction of GADGET-2 (Springel, Yoshida & White,
2001; Springel, 2005). Therefore, COLA enables us to cheaply generate non-linear density
fields at the required accuracy.
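
Two ingredients of this setup can be illustrated with a short sketch: a timestep schedule that is logarithmically spaced in the scale factor, and a cloud-in-cell assignment of particles to a periodic grid. This is not the COLA or BORG implementation; the function names are hypothetical, and reading "logarithmically-spaced" as a constant ratio between consecutive scale factors is an assumption.

```python
import numpy as np

def log_spaced_scale_factors(z_start=69.0, z_end=0.0, n_steps=30):
    """Return n_steps + 1 scale factors a = 1/(1+z) with a constant ratio between neighbours."""
    a_start = 1.0 / (1.0 + z_start)
    a_end = 1.0 / (1.0 + z_end)
    return np.geomspace(a_start, a_end, n_steps + 1)

def cic_density_contrast(positions, box_size, n_grid):
    """Cloud-in-cell binning of particles on a periodic cubic grid.

    positions: array of shape (n_particles, 3), in the same units as box_size.
    Returns the density contrast delta = rho / mean(rho) - 1 on an n_grid^3 mesh.
    """
    cell = box_size / n_grid
    s = np.asarray(positions, dtype=float) / cell   # positions in grid units
    i0 = np.floor(s).astype(int)                    # lower neighbouring node
    frac = s - i0                                   # distance to that node, in [0, 1)
    rho = np.zeros((n_grid, n_grid, n_grid))
    for dx in (0, 1):
        wx = frac[:, 0] if dx else 1.0 - frac[:, 0]
        for dy in (0, 1):
            wy = frac[:, 1] if dy else 1.0 - frac[:, 1]
            for dz in (0, 1):
                wz = frac[:, 2] if dz else 1.0 - frac[:, 2]
                idx = ((i0[:, 0] + dx) % n_grid,
                       (i0[:, 1] + dy) % n_grid,
                       (i0[:, 2] + dz) % n_grid)
                # np.add.at accumulates correctly when several particles hit the same node.
                np.add.at(rho, idx, wx * wy * wz)
    return rho / rho.mean() - 1.0
```

With the numbers quoted above, the voxel size is 750/256 ≈ 2.93 Mpc/h, consistent with the stated resolution of around 3 Mpc/h.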

As an illustration of the improvement introduced by non-linear filtering at the level of
two-point statistics, in figure 1, we plot the power spectra of redshift-zero dark matter
density fields. The agreement between unconstrained and constrained realizations at all scales
can be checked. The plot also shows that our set of constrained reconstructions contains the
additional power expected in the non-linear regime$^1$, up to $k \sim 0.4$ (Mpc/h)$^{-1}$.
For a visual illustration of the non-linear filtering procedure, which makes it possible to
check the phases of corresponding 2LPT and COLA fields, the reader is referred to figure 2 in
Leclercq et al. (2015). The final density field predicted by COLA is visually
indistinguishable from the right panel there, corresponding to the GADGET-2 result.

C. Classification of the cosmic web 

The BORG filtered reconstructions permit a variety of scientific analyses of the large scale
structure in the observed Universe. In this work, we focus specifically on the possibility to
characterize the cosmic web by distinct structure types. Generally, any of the methods cited
in the introduction can be employed for analysis of our density samples, however for the
purpose of this paper, we follow the dynamical cosmic web classification procedure as proposed
by Hahn et al. (2007a). In analogy with the Zel’dovich (1970) theory, they propose to classify
the large scale structure environment into four web types (voids, sheets, filaments, and
clusters) based on a local-stability criterion for the orbits of test particles. The basic
idea of this dynamical classification approach is that the eigenvalues of the tidal tensor
characterize the geometrical properties of each point in space. The tidal tensor$^2$ $T_{ij}$
is given by the Hessian of the gravitational potential $\Phi$,

$T_{ij} = \frac{\partial^2 \Phi}{\partial x_i \partial x_j}$,    (1)

with $\Phi$ being the rescaled gravitational potential given by the reduced Poisson equation
(see appendix A in Forero-Romero et al., 2009),

$\nabla^2 \Phi = \delta$.    (2)

With these definitions, the three eigenvalues $\lambda_1 < \lambda_2 < \lambda_3$ of the tidal
tensor form a decomposition of the density contrast field, in the sense that the trace of
$T_{ij}$ is $\lambda_1 + \lambda_2 + \lambda_3 = \delta$. Each spatial point can then be
classified as a specific web type by considering the signs of $\lambda_1$, $\lambda_2$,
$\lambda_3$. Namely, a void point corresponds to no positive eigenvalue, a sheet to one, a
filament to two and a cluster to three positive eigenvalues (Hahn et al., 2007a). The
interpretation of this rule is straightforward, as the sign of an eigenvalue at a given
position defines whether the gravitational force in the direction of the corresponding
eigenvector is contracting (positive eigenvalues) or expanding (negative eigenvalues).

$^1$ Note that the lack of small scale power in COLA with respect to theoretical predictions,
for $k > 0.5$ (Mpc/h)$^{-1}$, is a gridding artifact due to the finite mesh size used for the
analysis. This value corresponds to around one quarter of the Nyquist wavenumber.

$^2$ We follow the nomenclature of Hahn et al. (2007a); Hoffman et al. (2012), who call
$T_{ij}$ (including its trace part) the tidal tensor; it is called the deformation tensor by
Forero-Romero et al. (2009).

Several extensions of this classification procedure exist. Forero-Romero et al. (2009) argued
that rather than using a threshold value $\lambda_\mathrm{th}$ of zero, different positive
values can yield better web classifications down to the megaparsec scale. Hoffman et al.
(2012) reformulated the procedure using the velocity shear tensor (the “V-web”) instead of the
gravitational tidal tensor (the “T-web”). They showed that the two classifications coincide at
large scales and that the velocity field resolves finer structure than the gravitational field
at the smallest scales (sub-megaparsec). In this work, we will probe scales down to
$\sim$ 3 Mpc/h (the voxel size in our reconstructions). Therefore, we will be content with the
original classification procedure as proposed by Hahn et al. (2007a). The structures are then
classified according to the rules given in table I.
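
The rules of table I amount to counting, at each voxel, how many eigenvalues of the tidal tensor exceed the threshold. The following Python sketch illustrates this; the function name, the `WEB_TYPES` tuple and the `threshold` argument are hypothetical names chosen here, with the default threshold of zero corresponding to the original Hahn et al. (2007a) rule adopted in this work.

```python
import numpy as np

# Index convention: 0 = void, 1 = sheet, 2 = filament, 3 = cluster.
WEB_TYPES = ("void", "sheet", "filament", "cluster")

def classify_web(eigenvalues, threshold=0.0):
    """Count the eigenvalues above `threshold` at each voxel.

    eigenvalues: array of shape (..., 3) holding the three tidal-tensor eigenvalues.
    Returns an integer array of shape (...) with values in {0, 1, 2, 3}.
    An eigenvalue exactly equal to the threshold is counted as not exceeding it.
    """
    lam = np.asarray(eigenvalues, dtype=float)
    return np.sum(lam > threshold, axis=-1)
```

The result does not depend on the ordering of the three eigenvalues, so the array can be passed in whatever order the eigenvalue solver returns.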

It is important to note that the tidal tensor and the rescaled gravitational potential are
both physical quantities, and hence their calculation requires the availability of a full
physical density field in contrast to a smoothed mean reconstruction of the density field. As
described in BORG SDSS, density samples obtained by the BORG algorithm provide such required
full physical density fields. The tidal tensor can therefore easily be calculated in each
density sample from the Fourier space representations of eq. (1) and (2) (see Hahn et al.,
2007a; Forero-Romero et al., 2009, for details on the technical implementation).
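
A minimal sketch of such a Fourier-space evaluation is given below, assuming a periodic cubic grid and the reduced Poisson equation in the form ∇²Φ = δ, so that each Fourier mode of the tidal tensor is k_i k_j δ(k) / k². The function name is hypothetical, and details of the cited implementations (smoothing, deconvolution of the mass assignment kernel) are not reproduced.

```python
import numpy as np

def tidal_eigenvalues(delta, box_size):
    """Eigenvalues of T_ij = d^2 Phi / dx_i dx_j with laplacian(Phi) = delta.

    delta: density contrast on a periodic cubic grid, shape (n, n, n).
    Returns an array of shape (n, n, n, 3) with eigenvalues in ascending order.
    """
    n = delta.shape[0]
    k1d = 2.0 * np.pi * np.fft.fftfreq(n, d=box_size / n)
    kx, ky, kz = np.meshgrid(k1d, k1d, k1d, indexing="ij")
    kvec = (kx, ky, kz)
    k2 = kx**2 + ky**2 + kz**2
    k2[0, 0, 0] = 1.0              # placeholder to avoid dividing by zero at k = 0
    delta_k = np.fft.fftn(delta)
    delta_k[0, 0, 0] = 0.0         # the mean of delta does not source the tidal field
    tidal = np.empty(delta.shape + (3, 3))
    for i in range(3):
        for j in range(i, 3):
            # Phi_k = -delta_k / k^2 and each derivative brings a factor i*k,
            # hence T_ij(k) = k_i k_j delta_k / k^2.
            t_ij = np.fft.ifftn(kvec[i] * kvec[j] * delta_k / k2).real
            tidal[..., i, j] = t_ij
            tidal[..., j, i] = t_ij
    return np.linalg.eigvalsh(tidal)
```

A useful consistency check is that the three eigenvalues sum to the mean-subtracted density contrast at every voxel, as stated in the text for the trace of the tidal tensor.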

Structure type    Rule
Void              $\lambda_1, \lambda_2, \lambda_3 < 0$
Sheet             $\lambda_1, \lambda_2 < 0$ and $\lambda_3 > 0$
Filament          $\lambda_1 < 0$ and $\lambda_2, \lambda_3 > 0$
Cluster           $\lambda_1, \lambda_2, \lambda_3 > 0$

TABLE I. Rules for the dynamic classification of web types.

The aforementioned web classifiers (Hahn et al., 2007a; Forero-Romero et al., 2009; Hoffman
et al., 2012) provide four voxel-wise scalar fields that characterize the large scale
structure. In a specific realization, the answer is unique, meaning that these fields obey the
following conditions at each voxel position $\vec{x}_k$:

$T_i(\vec{x}_k) \in \{0, 1\}$ for $i \in \{0, 1, 2, 3\}$ and $\sum_{i=0}^{3} T_i(\vec{x}_k) = 1$,    (3)

where $T_0$ = void, $T_1$ = sheet, $T_2$ = filament, $T_3$ = cluster. In this work, we follow
the Bayesian approach of BORG SDSS and quantify the degree of belief in structure type
classification. Specifically, our web classification is given in terms of four voxel-wise
scalar fields that obey the following conditions at each voxel position $\vec{x}_k$:

$\mathcal{T}_i(\vec{x}_k) \in [0, 1]$ for $i \in \{0, 1, 2, 3\}$ and $\sum_{i=0}^{3} \mathcal{T}_i(\vec{x}_k) = 1$.    (4)

Here, $\mathcal{T}_i(\vec{x}_k) = \langle T_i(\vec{x}_k) \rangle_{\mathcal{P}(T_i(\vec{x}_k)|d)}
= \mathcal{P}(T_i(\vec{x}_k)|d)$ are the posterior probabilities indicating the possibility to
encounter specific structure types at a given position in the observed volume, conditional on
the data. These are estimated by applying the web classification to all density samples and
counting the relative frequencies at each individual spatial coordinate within the set of
samples (see section 5 in Jasche et al., 2010b). With this definition, the cosmic web-type
posterior mean is given by

$\langle T_i(\vec{x}_k) \rangle = \frac{1}{N} \sum_{n=1}^{N} \sum_{j=0}^{3} \delta_{\mathrm{K}}^{ij} \, T_j^n(\vec{x}_k)$,    (5)

where $n$ labels one of the $N$ samples, $T_j^n(\vec{x}_k)$ is the result of the web
classifier on the $n$-th sample (i.e. a unit four-vector at each voxel position $\vec{x}_k$
containing zeros except for one component, which indicates the structure type), and
$\delta_{\mathrm{K}}^{ij}$ is a Kronecker symbol.
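
The relative-frequency estimate of eq. (5) can be sketched in a few lines of Python. The function name and the encoding of the classifier output as integer labels (0 = void, 1 = sheet, 2 = filament, 3 = cluster) are assumptions made for illustration.

```python
import numpy as np

def web_type_posterior(classified_samples, n_types=4):
    """Estimate the web-type pdf at each voxel from a set of classified samples.

    classified_samples: integer array of shape (N, ...) where entry [n, ...] is the
    structure type assigned to that voxel in the n-th sample.
    Returns an array of shape (n_types, ...) holding, for each type, the fraction of
    samples in which the voxel received that type.
    """
    labels = np.asarray(classified_samples)
    return np.stack([(labels == i).mean(axis=0) for i in range(n_types)])
```

By construction the four returned fields lie in [0, 1] and sum to one at every voxel, which are the conditions stated in eq. (4).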

III. THE LATE-TIME LARGE-SCALE STRUCTURE 

In this section, we discuss the results of our analysis of the final density field, at a = 1.
For reasons of computational time with COLA filtering (see section II B), we kept around 10%
of the original set of samples obtained in BORG SDSS. In order to mitigate as much as possible
the effects of correlation among samples, we maximally separated the samples kept for the
present analysis, keeping one out of ten consecutive samples of the original Markov Chain.
Hence, for all results discussed in this section, we used a total of 1,097 samples inferred by
BORG and filtered with COLA.


A. Tidal environment 

As a natural byproduct, the application of the Hahn et al. web
classifier to density samples yields samples of the probability
distribution functions for the three eigenvalues of the tidal field
tensor. These pdfs account for the assumed physical model of structure
formation and the data constraints, and quantify uncertainty coming in
particular from selection effects, survey geometry and galaxy biases.
As described in BORG SDSS, the ensemble of samples permits us to
provide any desired statistical summary such as mean and variance.

In figure 2, we show slices$^3$ through the ensemble mean fields
$\lambda_1$, $\lambda_2$ and $\lambda_3$. For visual comparison, the
rightmost panel of figure 2 shows the corresponding slice through the
posterior mean of the final density contrast,
$\delta = \lambda_1 + \lambda_2 + \lambda_3$, obtained in BORG SDSS.
Different morphologies can be observed in the data-constrained parts
of these slices: $\lambda_1$, $\lambda_2$ and $\lambda_3$ respectively
trace well the clusters, filaments and sheets, as we now argue. The
$\lambda_1$ field is rather homogeneous, apart from small spots where
all eigenvalues are largely positive, i.e. regions undergoing dramatic
gravitational collapse along three axes. These correspond to the
dynamic clusters. Note that there exists a form of "tidal
compensation": these clusters are surrounded by regions where
$\lambda_1$ is smaller than its cosmic mean. More patterns can be
observed in the $\lambda_2$ field: it also exhibits filaments
(appearing as dots when piercing the slice). Finally, the $\lambda_3$
field is highly structured, as it also traces sheets (which appear
filamentary when sliced). Dynamic voids can also be easily
distinguished in this field, wherever $\lambda_3$ is negative.

$^3$ In all slice plots of this paper, we kept the coordinate system
of BORG SDSS.

FIG. 2. Slices through the three-dimensional ensemble posterior mean
for the eigenvalues $\lambda_1 < \lambda_2 < \lambda_3$ of the tidal
field tensor in the final conditions, estimated from 1,097 samples.
The rightmost panel shows the corresponding slice through the
posterior mean for the final density contrast
$\delta = \lambda_1 + \lambda_2 + \lambda_3$, obtained in Jasche,
Leclercq & Wandelt (2015).

FIG. 3. Slices through the posterior mean for different structure
types (from left to right: void, sheet, filament, and cluster) in the
late-time large-scale structure in the Sloan volume (a = 1). These
four three-dimensional voxel-wise pdfs sum up to one on a voxel basis.
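For readers who want to reproduce such eigenvalue fields, here is a hedged sketch of how they could be computed on a periodic grid. It assumes that the tidal tensor is the Hessian of a rescaled gravitational potential $\Phi$ obeying $\nabla^2 \Phi = \delta$ (the convention we take equations (1) and (2) to follow), and it uses a plain FFT-based solver. The function name, the 16-cubed grid and the white-noise density field are illustrative choices only.

```python
import numpy as np


def tidal_eigenvalues(delta, box_size):
    """Eigenvalues (ascending) of T_ij = d_i d_j Phi with Laplacian(Phi) = delta.

    `delta` is a periodic cubic density contrast grid; `box_size` is its
    side length. Assumed convention, see the lead-in text.
    """
    n = delta.shape[0]
    k1d = 2.0 * np.pi * np.fft.fftfreq(n, d=box_size / n)
    k = np.meshgrid(k1d, k1d, k1d, indexing="ij")
    k2 = k[0] ** 2 + k[1] ** 2 + k[2] ** 2
    k2[0, 0, 0] = 1.0  # avoid 0/0; the k = 0 mode is removed just below
    delta_k = np.fft.fftn(delta)
    delta_k[0, 0, 0] = 0.0
    tidal = np.empty(delta.shape + (3, 3))
    for i in range(3):
        for j in range(i, 3):
            # In Fourier space, d_i d_j Phi = k_i k_j delta_k / k^2.
            t_ij = np.fft.ifftn(k[i] * k[j] / k2 * delta_k).real
            tidal[..., i, j] = t_ij
            tidal[..., j, i] = t_ij
    return np.linalg.eigvalsh(tidal)


rng = np.random.default_rng(0)
delta = rng.normal(size=(16, 16, 16))
delta -= delta.mean()  # a density contrast has zero mean
lam = tidal_eigenvalues(delta, box_size=750.0)
```

With this convention the three eigenvalues sum to the density contrast at every voxel, which is the relation $\delta = \lambda_1 + \lambda_2 + \lambda_3$ used above.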


B. Probabilistic web-type cartography 

Building upon previous results and using the procedure described in
section II C, we obtain probabilistic maps of structures. More
precisely, we obtain four probability distributions at each spatial
position, $\mathcal{P}(T_i(\vec{x}_k)|d)$, indicating the probability
of encountering a specific structure type (cluster, filament, sheet,
void) at that position. As noted in section II C, these pdfs take
their values in the range [0,1] and sum up to one on a voxel basis.
Figure 3 shows slices through their means (see equation (5)). The plot
shows the anticipated behavior, with a high degree of structure and
values close to certainty (i.e. zero or one) in regions covered by
data, while the unobserved regions approach a uniform value
corresponding to the prior.

At this point, it is worth noting that the Hahn et al. web classifier
has a prior preference for some structure types. Using unconstrained
large-scale structure realizations produced with the same setup$^4$,
we measured that these prior probabilities, $\mathcal{P}(T_i)$, can be
well described by Gaussians whose mean and standard deviation are
given in table II.

$^4$ By this, we specifically mean realizations obtained from initial
randomly-generated Gaussian density fields with an Eisenstein & Hu
(1998, 1999) power spectrum using the fiducial cosmological parameters
of the BORG analysis ($\Omega_m = 0.272$, $\Omega_\Lambda = 0.728$,
$\Omega_b = 0.045$, $h = 0.702$, $\sigma_8 = 0.807$, $n_s = 0.961$,
see BORG SDSS). The density field is defined on a 750 Mpc/h cubic

Structure type    $\mu_{\mathcal{P}(T_i)}$    $\sigma_{\mathcal{P}(T_i)}$
Late-time large-scale structure (a = 1)
Void              0.14261                     $6.1681 \times 10^{-4}$
Sheet             0.59561                     $6.3275 \times 10^{-4}$
Filament          0.24980                     $5.5637 \times 10^{-4}$
Cluster           0.01198                     $5.8793 \times 10^{-5}$

TABLE II. Prior probabilities assigned by the Hahn et al. (2007a) web
classifier to the different structure types, in the late-time
large-scale structure (a = 1).

In addition to their ensemble mean, the set of samples makes it
possible to propagate uncertainty quantification to web-type
classification. In particular, it allows us to locally assess the
strength of data constraints. In information theory, a convenient way
to characterize the uncertainty content of a random source S is the
Shannon entropy (Shannon, 1948), defined by

$$H[S] \equiv -\sum_i p_i \log_2(p_i), \qquad (6)$$

where the $p_i$ are the probabilities of possible events. This
definition yields expected properties and accounts for the intuition
that the more likely an event is, the less information it provides
when it occurs (i.e. the more it contributes to the source entropy).
We follow this prescription and write the voxel-wise entropy of the
web-type posterior, $\mathcal{P}(T(\vec{x}_k)|d)$, as

$$H[\mathcal{P}(T(\vec{x}_k)|d)] \equiv -\sum_{i=0}^{3} \mathcal{P}(T_i(\vec{x}_k)|d) \log_2\left(\mathcal{P}(T_i(\vec{x}_k)|d)\right). \qquad (7)$$

It is a number in the range [0, 2] and its natural unit is the shannon
(Sh). H = 0 Sh in the case of perfect certainty, i.e. when the data
constraints entirely determine the underlying structure type:
$\mathcal{P}(T_{i_0}(\vec{x}_k)|d)$ is 1 for one $i_0$ and 0 for
$i \neq i_0$. H reaches its maximum value of 2 Sh when all
$\mathcal{P}(T_i(\vec{x}_k)|d)$ are equal to 1/4. This is the case of
maximal randomness: all the events being equally likely, no
information is gained when one occurs.

A slice through the voxel-wise entropy of the web-type posterior is
shown in the left panel of figure 4. Generally, the entropy map
reflects the information content of the posterior pdf, which comes
from augmenting the information content of the prior pdf with the data
constraints, in the Bayesian way.

The entropy takes low values and shows a high degree of structure in
the regions where data constraints exist, and even reaches zero in
some spots where the data are perfectly informative. Comparing with
figures 2 and 3, one can note that this structure is highly
non-trivial and does not follow any of the previously described
patterns. This is because, in a Poisson process, the signal (here the
density, inferred in BORG SDSS) is correlated with the uncertainty,
and because structure type classification is, in addition, a
non-linear function of the density field. In the unobserved regions,
the entropy fluctuates around a constant value of about 1.4 Sh, which
characterizes the information content of the prior. This value is
consistent with the expectation, which can be computed using equation
(7) (unconditional on the data) and the numbers given in table II.
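The consistency check mentioned above is a one-line computation. As a small worked example (the helper name `shannon_entropy` is ours, and the prior means are copied from tables II and V), equation (7) applied to the prior probabilities gives:

```python
import numpy as np


def shannon_entropy(p):
    """Shannon entropy in shannons (bits); terms with p = 0 contribute 0."""
    p = np.asarray(p, dtype=float)
    nonzero = p > 0
    return float(-np.sum(p[nonzero] * np.log2(p[nonzero])))


prior_late = [0.14261, 0.59561, 0.24980, 0.01198]     # table II (a = 1)
prior_initial = [0.07979, 0.42022, 0.42022, 0.07978]  # table V (a = 10^-3)

h_late = shannon_entropy(prior_late)        # about 1.42 Sh
h_initial = shannon_entropy(prior_initial)  # about 1.63 Sh
h_max = shannon_entropy([0.25, 0.25, 0.25, 0.25])  # 2 Sh, maximal randomness
h_min = shannon_entropy([1.0, 0.0, 0.0, 0.0])      # 0 Sh, perfect certainty
print(h_late, h_initial, h_max, h_min)
```

The two prior values agree with the levels of about 1.4 Sh and 1.6 Sh quoted for the unobserved regions of the final and initial entropy maps, respectively.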

The information-theoretic quantity that measures the information gain
(in shannons) due to the data is the relative entropy or
Kullback-Leibler divergence (Kullback & Leibler, 1951) of the
posterior from the prior,

$$D_\mathrm{KL}\left[\mathcal{P}(T(\vec{x}_k)|d) \,\|\, \mathcal{P}(T)\right] \equiv \sum_{i=0}^{3} \mathcal{P}(T_i(\vec{x}_k)|d) \log_2\left(\frac{\mathcal{P}(T_i(\vec{x}_k)|d)}{\mathcal{P}(T_i)}\right) = -H[\mathcal{P}(T(\vec{x}_k)|d)] - \sum_{i=0}^{3} \mathcal{P}(T_i(\vec{x}_k)|d) \log_2\left(\mathcal{P}(T_i)\right). \qquad (8)$$

It is a non-symmetric measure of the difference between the two
probability distributions.

A slice through the voxel-wise Kullback-Leibler divergence of the
web-type posterior from the prior is shown in the right panel of
figure 4. As expected, the information gain is zero outside the survey
boundaries. In the observed regions, SDSS galaxies are informative on
underlying structure types at the level of at least ~ 1 Sh. This
number can go to ~ 3 Sh in the interior of deep voids and up to ~ 6 Sh
in the densest clusters. This map makes it possible to visualize the
regions where additional data would be needed to improve structure
type classification, e.g. in some high-redshift regions where
uncertainty remains due to selection effects.
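To make the orders of magnitude concrete, equation (8) can be evaluated for a few limiting posteriors. This is an illustrative sketch only: the function name is ours, the prior means come from table II, and the "certain" posteriors are idealized inputs rather than values read off the maps.

```python
import numpy as np


def kl_divergence(posterior, prior):
    """Kullback-Leibler divergence of `posterior` from `prior`, in shannons."""
    posterior = np.asarray(posterior, dtype=float)
    prior = np.asarray(prior, dtype=float)
    nonzero = posterior > 0  # terms with zero posterior probability vanish
    return float(np.sum(posterior[nonzero]
                        * np.log2(posterior[nonzero] / prior[nonzero])))


prior = np.array([0.14261, 0.59561, 0.24980, 0.01198])  # table II (a = 1)

gain_unobserved = kl_divergence(prior, prior)                     # 0 Sh
gain_certain_void = kl_divergence([1.0, 0.0, 0.0, 0.0], prior)    # about 2.81 Sh
gain_certain_cluster = kl_divergence([0.0, 0.0, 0.0, 1.0], prior) # about 6.38 Sh
print(gain_unobserved, gain_certain_void, gain_certain_cluster)
```

A voxel that the data identify with certainty as a void or as a cluster therefore yields $-\log_2 \mathcal{P}(T_i)$, i.e. about 2.8 Sh and 6.4 Sh, which is in line with the ~ 3 Sh and ~ 6 Sh levels quoted above for deep voids and dense clusters.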


grid of $256^3$ voxels and populated by $512^3$ dark matter particles,
which are evolved to z = 69 with 2LPT and from z = 69 to z = 0 with
COLA, using 30 timesteps logarithmically spaced in the scale factor.
The particles are binned on a $256^3$-voxel grid with the CiC scheme
to get the final density field.
FIG. 4. Slices through the entropy of the structure types posterior
(left panel) and the Kullback-Leibler divergence of the posterior from
the prior (right panel), in the final conditions. The entropy H,
defined by equation (7), quantifies the information content of the
posterior pdf represented in figure 3, which results from fusing the
information content of the prior and the data constraints. The
Kullback-Leibler divergence $D_\mathrm{KL}$, defined by equation (8),
represents the information gained in moving from the prior to the
posterior. It quantifies the information that has been learned on
structure types by looking at SDSS galaxies.


C. Volume and mass filling fractions 

A characterization of large-scale environments commonly found in the
literature involves evaluating global quantities such as the volume
and mass content of these structures. In a particular realization, the
volume filling fraction (VFF) for structure type $T_i$ is the number
of voxels of type $T_i$ divided by the total number of voxels in the
considered volume,

$$\mathrm{VFF}(T_i) \equiv \frac{\sum_{\vec{x}_k} \sum_{j=0}^{3} \delta_{ij} \, T_j(\vec{x}_k)}{N_\mathrm{vox}}. \qquad (9)$$

The mass filling fraction (MFF) can be obtained in a similar manner by
weighting the same sum by the local density
$\rho(\vec{x}_k) = \bar{\rho} \, (1 + \delta(\vec{x}_k))$,

$$\mathrm{MFF}(T_i) \equiv \frac{\sum_{\vec{x}_k} \sum_{j=0}^{3} (1 + \delta(\vec{x}_k)) \, \delta_{ij} \, T_j(\vec{x}_k)}{\sum_{\vec{x}_k} (1 + \delta(\vec{x}_k))}. \qquad (10)$$

To ensure that results are not prior-dominated, we measured the VFFs
and MFFs in the data-constrained parts of our realizations. More
precisely, we limited ourselves to the voxels where the survey
response operator (representing simultaneously the survey geometry and
the selection effects, see BORG SDSS) is strictly positive. This
amounts to $N_\mathrm{vox}$ = 3,148,504 out of $256^3$ = 16,777,216
voxels, around 18.7% of the full box (see also section II.C.2. and
figure 3 in Leclercq et al., 2015). In equations (9) and (10),
$\vec{x}_k$ labels one of these voxels.
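The two definitions translate directly into array operations. The sketch below is illustrative only: the function name, the six-voxel toy data and the boolean `mask` (standing in for "survey response operator strictly positive") are assumptions, and the MFF is normalized by the total mass inside the mask.

```python
import numpy as np


def filling_fractions(type_map, delta, mask):
    """Volume and mass filling fractions of the four web types.

    `type_map` holds integer types in {0, 1, 2, 3}, `delta` the density
    contrast, and `mask` selects the voxels kept for the measurement.
    """
    types = type_map[mask]
    mass = 1.0 + delta[mask]  # proportional to the local density
    vff = np.array([(types == i).mean() for i in range(4)])
    mff = np.array([mass[types == i].sum() for i in range(4)]) / mass.sum()
    return vff, mff


# Toy data (hypothetical): six voxels, the last one outside the survey mask.
type_map = np.array([0, 1, 1, 2, 3, 3])
delta = np.array([-0.8, -0.2, 0.0, 1.0, 4.0, 9.0])
mask = np.array([True, True, True, True, True, False])

vff, mff = filling_fractions(type_map, delta, mask)
print(vff, mff)
```

In this toy case the single cluster voxel inside the mask fills 20% of the volume but carries 5/9 of the mass, which illustrates why the VFF and the MFF of a given structure type can differ strongly.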

By measuring the VFF and MFF of different structure types in each
constrained realization of our ensemble, we obtained the posterior
pdfs, $\mathcal{P}(\mathrm{VFF}(T_i)|d)$ and
$\mathcal{P}(\mathrm{MFF}(T_i)|d)$, conditional on the data.
Similarly, we computed the prior pdfs,
$\mathcal{P}(\mathrm{VFF}(T_i))$ and $\mathcal{P}(\mathrm{MFF}(T_i))$,
using unconstrained realizations produced with the same setup. We
found that all these pdfs can be well described by Gaussians, the
means and standard deviations of which are given in tables III and IV.

Previous studies on this topic (e.g. Doroshkevich, 1970; Shen et al.,
2006; Hahn et al., 2007a; Forero-Romero et al., 2009; Jasche et al.,
2010b; Aragon-Calvo, van de Weygaert & Jones, 2010; Shandarin, Habib &
Heitmann, 2012; Cautun et al., 2014) have found a wide range of values
for the VFF and MFF of structures (see e.g. table 3 in Cautun et al.,
2014). For example, existing studies found that clusters occupy at
most a few percent of the volume of the Universe but contribute
significantly to the mass content, with a MFF ranging from ~ 10% (Hahn
et al., 2007a; Cautun et al., 2014) to ~ 40% (Shandarin, Habib &
Heitmann, 2012). The void volume fraction can vary from ~ 10% (Hahn et
al., 2007a) to ~ 80% (Aragon-Calvo, van de Weygaert & Jones, 2010;
Shandarin, Habib & Heitmann, 2012; Cautun et al., 2014); in the
Forero-Romero et al. (2009) formalism, it is a very sensitive function
of the threshold $\lambda_\mathrm{th}$ (figure 9 in Jasche et al.,
2010b). These large disparities in the literature arise because
different algorithms use different information and criteria for
classifying the cosmic web. For this reason, we believe that it is
only relevant to make relative statements for the same setup, i.e. to
compare our results to the corresponding prior quantities, as done in
tables III and IV. For this purpose, the large number of samples used
allowed a precise characterization of the pdfs so that all digits
quoted in the tables are significant. Note that all our analyses are
repeatable for different setups, which allows in principle a
comparison with any previous work.

                  Posterior                                        Prior
Structure type    $\mu_\mathrm{VFF}$    $\sigma_\mathrm{VFF}$      $\mu_\mathrm{VFF}$    $\sigma_\mathrm{VFF}$
Late-time large-scale structure (a = 1)
Void              0.14897               $1.8256 \times 10^{-3}$    0.14254               $6.2930 \times 10^{-3}$
Sheet             0.58914               $1.3021 \times 10^{-3}$    0.59562               $2.2375 \times 10^{-3}$
Filament          0.24689               $1.1295 \times 10^{-3}$    0.24986               $4.4440 \times 10^{-3}$
Cluster           0.01499               $8.7274 \times 10^{-5}$    0.01198               $1.9194 \times 10^{-4}$

TABLE III. Mean and standard deviation of the prior and posterior pdfs
for the volume filling fraction of different structure types in the
late-time large-scale structure (a = 1).

As expected for a Bayesian update of the degree of belief, the
posterior quantities generally have smaller variance and a mean value
displaced from the prior mean. For the MFF, the posterior means are
always within two standard deviations of the corresponding prior
means. The analysis shows that in the SDSS, a larger mass fraction is
occupied by clusters, sheets, and voids, to the detriment of
filaments, in comparison to the prior expectation. The data also favor
a smaller filling of the Sloan volume by filaments and sheets and a
larger filling by voids and clusters. For the cluster VFF, the
posterior mean, $\mu_{\mathrm{VFF}(T_3)|d} = 0.01499$, lies more than
15 standard deviations
($\sigma_{\mathrm{VFF}(T_3)} = 1.9194 \times 10^{-4}$) away from the
prior mean, $\mu_{\mathrm{VFF}(T_3)} = 0.01198$. Given other results
on the VFF and MFF, we believe that the data truly favor a higher
volume content in clusters as compared to the structure formation
model used as prior. However, this surprising result should be treated
with care; part of the discrepancy is likely due to the original BORG
analysis, which optimizes the initial conditions for evolution with
2LPT (instead of the non-linear evolution with COLA used for the
present work). LPT predicts fuzzier halos than N-body dynamics, which
results in the incorrect prediction of a high cluster VFF (Leclercq et
al., 2013).
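The displacements discussed in this paragraph can be recomputed from the tabulated numbers. The short script below is a worked example rather than part of the analysis pipeline; it copies the means and prior standard deviations from tables III and IV and expresses each posterior shift in units of the prior standard deviation (variable names are ours).

```python
import numpy as np

types = ["void", "sheet", "filament", "cluster"]

# Table III (volume filling fractions, a = 1).
vff_post_mean = np.array([0.14897, 0.58914, 0.24689, 0.01499])
vff_prior_mean = np.array([0.14254, 0.59562, 0.24986, 0.01198])
vff_prior_std = np.array([6.2930e-3, 2.2375e-3, 4.4440e-3, 1.9194e-4])

# Table IV (mass filling fractions, a = 1).
mff_post_mean = np.array([0.04050, 0.35605, 0.47356, 0.12990])
mff_prior_mean = np.array([0.03876, 0.35286, 0.48170, 0.12666])
mff_prior_std = np.array([2.3352e-3, 3.6854e-3, 4.2215e-3, 1.8284e-3])

# Shift of the posterior mean, in units of the prior standard deviation.
vff_shift = (vff_post_mean - vff_prior_mean) / vff_prior_std
mff_shift = (mff_post_mean - mff_prior_mean) / mff_prior_std

for name, v, m in zip(types, vff_shift, mff_shift):
    print(f"{name:9s} VFF shift = {v:+6.2f} sigma, MFF shift = {m:+6.2f} sigma")
```

The cluster VFF shift evaluates to about +15.7 prior standard deviations, while all four MFF shifts stay below two in absolute value, with the filament MFF being the only negative one.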


                  Posterior                                        Prior
Structure type    $\mu_\mathrm{MFF}$    $\sigma_\mathrm{MFF}$      $\mu_\mathrm{MFF}$    $\sigma_\mathrm{MFF}$
Late-time large-scale structure (a = 1)
Void              0.04050               $8.3531 \times 10^{-4}$    0.03876               $2.3352 \times 10^{-3}$
Sheet             0.35605               $1.2723 \times 10^{-3}$    0.35286               $3.6854 \times 10^{-3}$
Filament          0.47356               $1.5661 \times 10^{-3}$    0.48170               $4.2215 \times 10^{-3}$
Cluster           0.12990               $6.4966 \times 10^{-4}$    0.12666               $1.8284 \times 10^{-3}$

TABLE IV. Mean and standard deviation of the prior and posterior pdfs
for the mass filling fraction of different structure types in the
late-time large-scale structure (a = 1).


IV. THE PRIMORDIAL LARGE-SCALE STRUCTURE 

In this section, we discuss the results of our analysis of the initial
density field, at $a = 10^{-3}$. Since the analysis of the primordial
large-scale structure does not involve an additional filtering step,
we have been able to keep a larger number of samples of the posterior
pdf for initial conditions, obtained in BORG SDSS. Hence, for all
results described in this section, we used a total of 4,473 samples.


A. Tidal environment 

As in section III A, the application of the Hahn et al. web classifier
to initial density samples yields the posterior pdf for the three
eigenvalues, $\lambda_1$, $\lambda_2$ and $\lambda_3$, of the initial
tidal field tensor. Figure 5 shows slices through their means. For
visual comparison, the rightmost panel shows the corresponding slice
through the posterior mean of the initial density contrast,
$\delta = \lambda_1 + \lambda_2 + \lambda_3$, obtained in BORG SDSS.

In a Gaussian random field, $\lambda_1$ is generally negative,
$\lambda_3$ is generally positive and $\lambda_2$ is close to zero
(see the unobserved parts of the slices in figure 5). In addition,
$\lambda_2$ closely resembles the total density contrast $\delta$ up
to a global scaling. In the constrained regions, the eigenvalues of
the initial tidal tensor follow this behavior. The structure observed
in their maps is visually consistent with the decomposition of
Gaussian density fluctuations as shown by the right panel.


B. Probabilistic web-type cartography 

Looking at the sign of the eigenvalues of the initial tidal tensor and
following the procedure described in section II C, we obtain a
probabilistic cartography of the primordial large-scale structure. As
before, we obtain four voxel-wise pdfs
$\mathcal{P}(T_i(\vec{x}_k)|d)$, taking their values in the range
[0,1] and summing up to one. Figure 6 shows slices through their
means, defined by equation (5). As in the final conditions, the maps
exhibit structure in the data-constrained regions and approach uniform
values in the unobserved parts, corresponding to the respective
priors. Using unconstrained realizations of Gaussian random fields
produced with the same setup$^5$, we measured these prior
probabilities. Their means and standard deviations are given in
table V.

FIG. 5. Slices through the three-dimensional ensemble posterior mean
for the eigenvalues $\lambda_1 < \lambda_2 < \lambda_3$ of the tidal
field tensor in the initial conditions, estimated from 4,473 samples.
The rightmost panel shows the corresponding slice through the
posterior mean for the initial density contrast
$\delta = \lambda_1 + \lambda_2 + \lambda_3$, obtained in Jasche,
Leclercq & Wandelt (2015).

FIG. 6. Slices through the posterior mean for different structure
types (from left to right: void, sheet, filament, and cluster) in the
primordial large-scale structure in the Sloan volume
($a = 10^{-3}$). These four three-dimensional voxel-wise pdfs sum up
to one on a voxel basis.

At this point, it is worth mentioning that there exists an additional
symmetry for Gaussian random fields. Since the definition of the tidal
tensor is linear in the density contrast (see equations (1) and (2))
and since positive and negative density contrasts are equally likely,
positive and negative values of a given $\lambda_i$ are equally
probable. Because of this sign symmetry, the pdfs for voids and
clusters (0 or 3 positive/negative eigenvalues) and the pdfs for
sheets and filaments (1 or 2 positive/negative eigenvalues) are equal.
This can be checked both in table V and in the unconstrained regions
of the maps in figure 6. In the constrained regions, a qualitative
complementarity between pdfs for voids and clusters and for sheets and
filaments can be observed. This can be explained as follows. As
$\sum_i \mathcal{P}(T_i(\vec{x}_k)|d) = 1$ and assuming that
$\mathcal{P}(T_i(\vec{x}_k)|d) \approx \mathcal{P}(T_{3-i}(\vec{x}_k)|d)$
for unlikely events, consistently with the previous remark, we get
$\mathcal{P}(T_0(\vec{x}_k)|d) \approx 1 - \mathcal{P}(T_3(\vec{x}_k)|d)$
wherever
$\mathcal{P}(T_1(\vec{x}_k)|d) \approx \mathcal{P}(T_2(\vec{x}_k)|d)$
is sufficiently small; and
$\mathcal{P}(T_1(\vec{x}_k)|d) \approx 1 - \mathcal{P}(T_2(\vec{x}_k)|d)$
wherever
$\mathcal{P}(T_0(\vec{x}_k)|d) \approx \mathcal{P}(T_3(\vec{x}_k)|d)$
is sufficiently small. These results are therefore consistent with
expectations based on Gaussianity for the primordial large-scale
structure in the Sloan volume.

$^5$ We used the initial conditions of our set of unconstrained
simulations (see footnote 4).

Structure type    $\mu_{\mathcal{P}(T_i)}$    $\sigma_{\mathcal{P}(T_i)}$
Primordial large-scale structure ($a = 10^{-3}$)
Void              0.07979                     $5.4875 \times 10^{-5}$
Sheet             0.42022                     $1.0240 \times 10^{-4}$
Filament          0.42022                     $1.0412 \times 10^{-4}$
Cluster           0.07978                     $5.6337 \times 10^{-5}$

TABLE V. Prior probabilities assigned by the Hahn et al. (2007a) web
classifier to the different structure types, in the primordial
large-scale structure ($a = 10^{-3}$).
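The sign symmetry invoked above can be illustrated numerically. The sketch below does not simulate a Gaussian random density field: as a simplifying assumption, it draws random symmetric 3x3 matrices as stand-ins for the tidal tensor, and only checks that flipping the sign of the tensor (i.e. of the density contrast) maps type $T_i$ onto type $T_{3-i}$. It makes no claim about the numerical values of table V.

```python
import numpy as np

rng = np.random.default_rng(42)

# Random symmetric matrices as stand-ins for tidal tensors (assumption).
a = rng.normal(size=(10000, 3, 3))
tensors = 0.5 * (a + np.swapaxes(a, -1, -2))

# Web type = number of positive eigenvalues (0 void, ..., 3 cluster).
types = np.sum(np.linalg.eigvalsh(tensors) > 0, axis=-1)
flipped_types = np.sum(np.linalg.eigvalsh(-tensors) > 0, axis=-1)

counts = np.bincount(types, minlength=4)
flipped_counts = np.bincount(flipped_types, minlength=4)
print(counts, flipped_counts)
```

Voids are exchanged with clusters and sheets with filaments under the sign flip, so any ensemble that is statistically invariant under $\delta \to -\delta$ must assign equal probabilities within each pair.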

As in section III B, the ensemble of samples permits us to propagate
uncertainties to structure type classification and to characterize the
strength of data constraints. In the left panel of figure 7, we show a
slice through the voxel-wise entropy of the web-type posterior pdf in
the initial conditions, defined by equation (7). This function
quantifies the information content of the posterior, which comes from
both the prior and the data constraints. As in the final conditions,
the entropy takes lower values inside the survey region. In the
unobserved parts, the entropy fluctuates around 1.6 Sh, a value that
characterizes the information content of the prior. Using equation (7)
(unconditional on the data) and the numbers given in table V, one can
check that this number is consistent with the expectation. In the
right panel of figure 7, we show a map of the Kullback-Leibler
divergence of the posterior from the prior, which represents the
information gain due to the data.

                  Posterior                                        Prior
Structure type    $\mu_\mathrm{VFF}$    $\sigma_\mathrm{VFF}$      $\mu_\mathrm{VFF}$    $\sigma_\mathrm{VFF}$
Primordial large-scale structure ($a = 10^{-3}$)
Void              0.07994               $4.0221 \times 10^{-4}$    0.07977               $1.0200 \times 10^{-3}$
Sheet             0.41994               $6.1770 \times 10^{-4}$    0.42019               $1.7885 \times 10^{-3}$
Filament          0.42048               $6.3589 \times 10^{-4}$    0.42024               $1.7820 \times 10^{-3}$
Cluster           0.07964               $3.8043 \times 10^{-4}$    0.07980               $1.0260 \times 10^{-3}$

TABLE VI. Mean and standard deviation of the prior and posterior pdfs
for the volume filling fraction of different structure types in the
primordial large-scale structure ($a = 10^{-3}$).


C. Volume and mass filling fractions 

We computed the volume and mass filling fractions (defined by
equations (9) and (10)) of different structure types in the primordial
large-scale structure in the Sloan volume. As for the final
conditions, we kept only the regions where the survey response
operator is strictly positive. In this way, we obtained the posterior
pdfs $\mathcal{P}(\mathrm{VFF}(T_i)|d)$ and
$\mathcal{P}(\mathrm{MFF}(T_i)|d)$. Using a set of unconstrained
Gaussian random fields, we also measured
$\mathcal{P}(\mathrm{VFF}(T_i))$ and $\mathcal{P}(\mathrm{MFF}(T_i))$
and found that all these pdfs are well described by Gaussians, the
means and standard deviations of which are given in tables VI and VII.

All posterior quantities obtained are within two standard deviations
of the corresponding prior means, and show smaller variance, as
expected. Hence, all results obtained are consistent with Gaussian
initial conditions.


V. EVOLUTION OF THE COSMIC WEB 

In addition to the inference of initial and final density fields, BORG
allows us to simultaneously analyze the formation history and
morphology of the observed large-scale structure, a subject that we
refer to as chrono-cosmography (BORG SDSS). In this section, we
discuss the evolution of the cosmic web from its origin
($a = 10^{-3}$, analyzed in section IV) to the present epoch (a = 1,
analyzed in section III). To do so, we use 11 snapshots saved during
the COLA filtering of our results (see section II B). These are
linearly separated in redshift from z = 10 to z = 0. We perform this
analysis in the 1,097 samples filtered with COLA considered in section
III. For each of these samples and for each redshift, we follow the
procedure described in sections II B and II C to compute the density
field and to classify the structure types.

                  Posterior                                        Prior
Structure type    $\mu_\mathrm{MFF}$    $\sigma_\mathrm{MFF}$      $\mu_\mathrm{MFF}$    $\sigma_\mathrm{MFF}$
Primordial large-scale structure ($a = 10^{-3}$)
Void              0.07958               $4.0122 \times 10^{-4}$    0.07941               $1.0163 \times 10^{-3}$
Sheet             0.41933               $6.1907 \times 10^{-4}$    0.41957               $1.7912 \times 10^{-3}$
Filament          0.42110               $6.3543 \times 10^{-4}$    0.42087               $1.7785 \times 10^{-3}$
Cluster           0.07999               $3.8206 \times 10^{-4}$    0.08015               $1.0293 \times 10^{-3}$

TABLE VII. Mean and standard deviation of the prior and posterior pdfs
for the mass filling fraction of different structure types in the
primordial large-scale structure ($a = 10^{-3}$).
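For orientation, the snapshot times implied by this description can be listed explicitly. This is a small sketch under the assumption that "linearly separated in redshift" means eleven equally spaced redshifts including both end points, converted with the standard relation a = 1/(1 + z).

```python
import numpy as np

# Eleven snapshots equally spaced in redshift, from z = 10 down to z = 0.
redshifts = np.linspace(10.0, 0.0, 11)

# Corresponding scale factors, a = 1 / (1 + z).
scale_factors = 1.0 / (1.0 + redshifts)

for z, a in zip(redshifts, scale_factors):
    print(f"z = {z:4.1f}  ->  a = {a:.4f}")
```

Under this assumption the earliest snapshot sits at a = 1/11, about 0.09, so the snapshots are spaced more densely in scale factor at early times than close to a = 1.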


A. Evolution of the probabilistic maps 

We followed the time evolution of the probabilistic web-type maps from
the primordial (figure 6) to the late-time large-scale structure
(figure 3). In unconstrained regions, these maps show the evolution of
the prior preference for specific structure types (see tables II and
V), in particular the breaking of the initial symmetry between voids
and clusters and between sheets and filaments, discussed in section
IV B.

In data-constrained regions, the time evolution of web-type maps makes
it possible to visually check the expansion history of individual
regions where the posterior probability of one specific structure is
high. In particular, it is easy to see that, as expected from their
dynamical definition, voids expand and clusters shrink in comoving
coordinates, from $a = 10^{-3}$ to a = 1 (the reader is invited to
compare the leftmost and rightmost panels of figures 3 and 6).
Similarly, regions corresponding with high probability to sheets and
filaments expand along two and one axis, respectively, and shrink
along the others. This phenomenon is more difficult to see in slices,
however, as the slicing plane intersects randomly the eigendirections
of the tidal tensor.

The time evolution of maps of the web-type posterior entropy (absolute
and relative to the prior) also exhibits some interesting features.
There, it is possible to simultaneously check the increase of the
information content of the prior (from H ≈ 1.6 Sh to H ≈ 1.4 Sh) and
the displacement of observational information operated by the physical
model. As the large-scale structure forms in the Sloan volume, data
constraints are propagated and the complex structure of the final
entropy map (figure 4), discussed in section III B, takes shape.

FIG. 7. Slices through the entropy of the structure types posterior
(left panel) and the Kullback-Leibler divergence of the posterior from
the prior (right panel), in the initial conditions. The entropy H,
defined by equation (7), quantifies the information content of the
posterior pdf represented in figure 6, which results from fusing the
information content of the prior and the data constraints. The
Kullback-Leibler divergence $D_\mathrm{KL}$, defined by equation (8),
represents the information gained in moving from the prior to the
posterior. It quantifies the information that has been learned on
structure types by looking at SDSS galaxies.


B. Volume filling fraction 

Our ensemble of snapshots allows us to check the time evolution of
global characterizations of the large-scale structure such as the
volume and mass filling fractions of different structures. As in
sections III C and IV C, we computed these quantities using only the
volume where the survey response operator is non-zero. In figure 8, we
plot the VFFs as a function of the scale factor. There, the solid
lines correspond to the pdf means and the shaded regions to the
2-$\sigma$ credible intervals, with light colors for the priors and
dark colors for the posteriors.
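The curves and bands of such a plot can be obtained from per-sample measurements as sketched below. This is an illustration under stated assumptions: the array layout (samples along the first axis, snapshots along the second), the function name and the toy numbers are ours, and the 2-$\sigma$ band is taken as mean plus or minus two sample standard deviations, which matches a credible interval only insofar as the pdfs are Gaussian.

```python
import numpy as np


def two_sigma_band(values):
    """Mean and 2-sigma band over samples for each snapshot.

    `values` has shape (n_samples, n_snapshots).
    """
    values = np.asarray(values, dtype=float)
    mean = values.mean(axis=0)
    std = values.std(axis=0, ddof=1)  # sample standard deviation
    return mean, mean - 2.0 * std, mean + 2.0 * std


# Toy VFF measurements (hypothetical): 3 samples, 3 snapshots.
toy_vff = np.array([
    [0.08, 0.10, 0.14],
    [0.10, 0.12, 0.16],
    [0.09, 0.11, 0.15],
])
mean, lower, upper = two_sigma_band(toy_vff)
print(mean, lower, upper)
```

Applying the same function to constrained and to unconstrained realizations would give the posterior and prior bands, respectively.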

The time variation of the VFF in figure 8 is consistent with the
expected dynamical behavior of structures. As voids and sheets expand
along three and two axes, respectively, their volume fraction
increases. Here, the posterior probabilities are mild updates of this
prediction. Conversely, as clusters and filaments shrink along three
and two axes, respectively, their volume fraction decreases. An
explanation for the substantial displacement of the posterior from the
prior, observed for clusters, can be found in section III C.


As already noted, the VFF is a very sensitive function of the precise
definition of structures, grid size, density assignment scheme,
smoothing scale, etc. For this reason, even for prior probabilities,
our results can be in qualitative disagreement with previous authors
(e.g. figure 23 in Cautun et al., 2014), due to their very different
definitions of structures. Therefore, we found it relevant only to
compare our posterior results with the prior predictions based on
unconstrained realizations. The same remark applies to the MFF in the
following section.

C. Mass filling fraction 

In figure 9, we show the time evolution of the mass filling fractions
using the same plotting conventions. Results are consistent with an
interpretation based on large-scale flows of matter. According to this
picture, voids always lose mass while clusters always become more
massive. The behavior of sheets and filaments can in principle be more
complex, since these regions have both inflows and outflows of matter
depending on the detail of their expansion profiles. In our setup, we
found that the number of axes along which there is expansion dominates
in the determination of the balance of inflow versus outflow, for
global quantities such as the MFF. Therefore, filaments always gain
mass and sheets always lose mass. Summing up our prior predictions, as
they expand along at least two axes, matter flows out of voids and
sheets and streams towards filaments and clusters.

FIG. 8. Time evolution of the volume filling fractions of different
structure types (from left to right and top to bottom: clusters,
filaments, sheets, voids). The solid lines show the pdf means and the
shaded regions are the 2-$\sigma$ credible intervals. Light colors are
used for the priors and dark colors for the posteriors.

The posterior probabilities slightly update this picture. Observations
support smaller outflowing of matter from voids. For structures
globally gaining matter, the priors are displaced towards less massive
filaments and more massive clusters. All posterior predictions fall
within the ~ 2-$\sigma$ credible interval from corresponding prior
means.


VI. SUMMARY AND CONCLUSION 

Along with Leclercq et al. (2015), this work exploits the high quality
of inference results produced by the application of the Bayesian code
BORG (Jasche & Wandelt, 2013) to the Sloan Digital Sky Survey main
galaxy sample (BORG SDSS). We presented a Bayesian cosmic web analysis
of the nearby Universe probed by the northern cap of the SDSS and its
surroundings. In doing so, we produced the first probabilistic,
four-dimensional maps of dynamic structure types in real observations.

As described in section II A, our method relies on the physical
inference of the initial density field in the LSS (Jasche & Wandelt,
2013; BORG SDSS). Starting from these, we generated a large set of
data-constrained realizations using the fast COLA method (section
II B). The use of 2LPT as a physical model in the inference process
and of the fully non-linear gravitational dynamics, provided by COLA,
as a filter allowed us to describe structures at the required
statistical accuracy, by very well representing the full hierarchy of
correlation functions. Even though initial conditions were inferred
with the approximate 2LPT model, we checked that the clustering
statistics of constrained non-linear model evaluations agree with
theoretical expectations up to scales considered in this work. As
described in section II C, we used the dynamic web-type classification
algorithm proposed by Hahn et al. (2007a) to dissect the cosmic web
into voids, sheets, filaments, and clusters.

FIG. 9. Same as figure 8 but for the mass filling fractions.

In sections III and IV, we presented the resulting maps 
of structures in the final and initial conditions, respectively, 
and studied the distribution of global quantities 
such as volume and mass filling fractions. In section V, 
we further analyzed the time evolution of our results, in 
a rigorous chrono-cosmographic framework. 
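
As an illustration of the global quantities mentioned above, the sketch below 
computes volume and mass filling fractions for a single realization. It assumes 
that web types are stored as integer labels (0 = void, 1 = sheet, 2 = filament, 
3 = cluster) and that the density array holds the mass per voxel on a regular grid. 
The function name filling_fractions is hypothetical. 

```python
import numpy as np

def filling_fractions(web, density, n_types=4):
    """Return (volume, mass) filling fractions for each web type.

    web     : integer labels per voxel (assumed convention: 0..n_types-1)
    density : mass per voxel, same shape as web
    """
    # Volume filling fraction: share of voxels carrying each label.
    vff = np.array([(web == t).mean() for t in range(n_types)])
    # Mass filling fraction: share of the total mass in voxels of each label.
    total_mass = density.sum()
    mff = np.array([density[web == t].sum() / total_mass for t in range(n_types)])
    return vff, mff
```

Evaluating these fractions on every sample, rather than on a mean field, gives 
a distribution of values instead of a single number. 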

For all results presented in this paper, we demonstrated 
a thorough capability of uncertainty quantification. 
Specifically, for all inferred maps and derived quantities, 
we obtained a probabilistic answer in terms of a prior 
and a posterior distribution. The variation between samples 
of the posterior distribution quantifies the remaining 
uncertainties of various origins (in particular noise, selection 
effects, survey geometry and galaxy bias; see BORG 
SDSS for a detailed discussion). Building upon our accurate 
probabilistic treatment, we looked at the entropy of 
the structure type posterior and at the relative entropy 
between posterior and prior. In doing so, we quantified 
the information gain due to SDSS galaxy data with 
respect to the underlying dynamic cosmic web and analyzed 
how this information is propagated during cosmic 
history. This study constitutes the first link between cosmology 
and information theory using real data. 
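
The two information-theoretic quantities named above can be sketched as follows. 
The example assumes that each sample provides an integer web-type label per voxel, 
that posterior probabilities are estimated by counting labels across samples, and 
that the prior assigns a non-zero probability to every type. Base-2 logarithms are 
used, so results are in bits. All function names are hypothetical. 

```python
import numpy as np

def type_probabilities(samples, n_types=4):
    """Estimate per-voxel type probabilities by counting over samples.

    samples : integer labels of shape (n_samples, ...); the first axis
              runs over realizations.
    """
    return np.stack([(samples == t).mean(axis=0) for t in range(n_types)], axis=-1)

def entropy_bits(p):
    """Shannon entropy -sum p log2 p along the last axis (0 log 0 taken as 0)."""
    safe = np.where(p > 0, p, 1.0)  # log2(1) = 0 removes empty bins
    return -np.sum(p * np.log2(safe), axis=-1)

def relative_entropy_bits(p, q):
    """Kullback-Leibler divergence sum p log2(p/q) along the last axis.

    Assumes q > 0 for every type.
    """
    ratio = np.where(p > 0, p / q, 1.0)
    return np.sum(p * np.log2(ratio), axis=-1)
```

With four structure types, the entropy ranges from 0 (type known with certainty) 
to 2 bits (all types equally likely), and the relative entropy vanishes wherever 
the posterior equals the prior. 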

In summary, our methodology yields an accurate cosmographic 
description of web types in the non-linear 
regime of structure formation, permits the analysis of their 
time evolution, and allows a precise uncertainty quantification 
in a full-scale Bayesian framework. These inference 
results can be used for a rich variety of applications, 
ranging from studying galaxies inside their environment 
to cross-correlating with other cosmological 
probes. They count among the first steps towards accurate 
chrono-cosmography, the subject of simultaneously 
analyzing the morphology and formation history of the 
inhomogeneous Universe. 

Note added: As we were finalizing this paper for submission, 
the works by Zhao et al. (2015) and Shi, Wang 
& Mo (2015) appeared, in which the relationship between 
halos and the cosmic web environment defined by the 
tidal tensor is studied. 